How to use pycurl if requested data is sometimes gzipped, sometimes not?


Question

I'm doing this to fetch some data:

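import pycurl
from io import BytesIO

c = pycurl.Curl()
c.setopt(pycurl.ENCODING, 'gzip')  # ask for gzip and let libcurl decode it
c.setopt(pycurl.URL, url)
c.setopt(pycurl.TIMEOUT, 10)  # seconds
c.setopt(pycurl.FOLLOWLOCATION, True)  # follow the 302 redirect

xml = BytesIO()  # pycurl passes the body to the callback as bytes

c.setopt(pycurl.WRITEFUNCTION, xml.write)

c.perform()
c.close()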

My urls are typically of this sort:

http://host/path/to/resource-foo.xml

Usually I get back 302 pointing to:

http://archive-host/path/to/resource-foo.xml.gz

Given that I have set FOLLOWLOCATION, and ENCODING gzip, everything works great.

The problem is, sometimes I have a URL which does not result in a redirect to a gzipped resource.  When this happens, c.perform() throws this error:

pycurl.error: (61, 'Error while processing content unencoding: invalid block type')

Which suggests to me that pycurl is trying to gunzip a resource that is not gzipped.

Is there some way I can instruct pycurl to figure out the response encoding, and gunzip or not as appropriate?  I have played around with using different values for ENCODING, but so far no beans.

The pycurl docs seem to be a little lacking.  :/

thx!

-----

Reference solution

Method 1:

If worst comes to worst, you could omit the ENCODING 'gzip' option, set HTTPHEADER to ['Accept-Encoding: gzip'] (pycurl expects a list of "Name: value" strings here, not a dict), check the response headers for "Content-Encoding: gzip" and, if it's present, gunzip the response yourself.
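
A minimal sketch of that manual approach, in case it helps. The names parse_content_encoding, maybe_gunzip and fetch are hypothetical helpers, not part of pycurl. It assumes Python 3 and that the server reports compression through a Content-Encoding header. Because FOLLOWLOCATION also passes the headers of the intermediate 302 to the header callback, the parser only keeps what it sees after the last status line.

```python
import gzip
from io import BytesIO


def parse_content_encoding(header_lines):
    """Return the lower-cased Content-Encoding of the final response, or ''."""
    encoding = ""
    for raw in header_lines:
        line = raw.decode("iso-8859-1").strip()
        if line.upper().startswith("HTTP/"):
            # A status line starts a new response (e.g. after a 302),
            # so forget any header seen for the previous one.
            encoding = ""
        elif ":" in line:
            name, value = line.split(":", 1)
            if name.strip().lower() == "content-encoding":
                encoding = value.strip().lower()
    return encoding


def maybe_gunzip(body, content_encoding):
    """Decompress body only when the server declared it as gzip."""
    if content_encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    return body


def fetch(url, timeout=10):
    """Hypothetical helper: fetch url and gunzip only if declared gzip."""
    import pycurl  # imported lazily so the helpers above work without pycurl

    body = BytesIO()
    header_lines = []
    c = pycurl.Curl()
    try:
        c.setopt(pycurl.URL, url)
        c.setopt(pycurl.TIMEOUT, timeout)
        c.setopt(pycurl.FOLLOWLOCATION, True)
        # No ENCODING option here: libcurl leaves the body untouched.
        c.setopt(pycurl.HTTPHEADER, ["Accept-Encoding: gzip"])
        c.setopt(pycurl.HEADERFUNCTION, header_lines.append)
        c.setopt(pycurl.WRITEFUNCTION, body.write)
        c.perform()
    finally:
        c.close()
    return maybe_gunzip(body.getvalue(), parse_content_encoding(header_lines))
```

With this sketch, fetch(url) returns the raw bytes whether or not the final response was gzipped. Callers that need text would still have to decode the result themselves.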

(by billc, Piskvor left the building)

References

  1. how to use pycurl if requested data is sometimes gzipped, sometimes not? (CC BY-SA 3.0/4.0)

#pycurl #Python #HTTP #libcurl #gzip





